Informal version for personal use Scalable Clustering

Authors

  • Joydeep Ghosh
  • Nong Ye
Abstract

2 Clustering Techniques: A Brief Survey
  2.1 Partitional Methods
  2.2 Hierarchical Methods
  2.3 Discriminative vs. Generative Models
  2.4 Assessment of Results
    2.4.1 Internal (model-based, unsupervised) Quality
    2.4.2 External (model-free, semi-supervised) Quality
  2.5 Visualization of Results


Similar resources

Merging Similarity and Trust Based Social Networks to Enhance the Accuracy of Trust-Aware Recommender Systems

In recent years, collaborative filtering (CF) methods have become important and widely accepted techniques for recommender systems. One of these techniques is user-based: it produces useful recommendations from the similarity between the ratings of like-minded users. However, these systems suffer from several inherent shortcomings, such as data sparsity and cold-start problems. With the dev...


Using fuzzy c-means clustering algorithm for common lecturer timetabling among departments

The university course timetabling problem is a hard problem that must be solved anew every term, which makes it an exhausting and time-consuming task. The presented approach focuses on making the process of timetabling common lecturers among different departments of a university scalable. The aim of this paper is to improve the satisfaction of com...


Scalable techniques for clustering the web pdf

Scalable Clustering. and text mining, spatial database applications, Web analysis, CRM, marketing. Powerful broadly applicable data mining clustering methods surveyed below. Since scalability is the major achievement of this blend strategy, this algorithm is. Using typical document clustering techniques on Web opinions produces unsatisfying results. In this work, we propose the scalable distance-ba...


Data clustering based on key identification

Clustering has been one of the main building blocks in the fields of machine learning and computer vision. Given a pairwise distance measure, it is challenging to find a proper way to identify a subset of representative exemplars and their associated cluster structures. Recent trends in big data analysis place more demanding requirements on new clustering algorithms, which must be both scalable and accura...


Exploiting parallelism to support scalable hierarchical clustering

A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard TREC test collection, our par...



Publication date: 2003